Deep Visual Semantic Embedding with Text Data Augmentation and Word Embedding Initialization
نویسندگان
چکیده
منابع مشابه
An Exploration of Word Embedding Initialization in Deep-Learning Tasks
Word embeddings are the interface between the world of discrete units of text processing and the continuous, differentiable world of neural networks. In this work, we examine various random and pretrained initialization methods for embeddings used in deep networks and their effect on the performance on four NLP tasks with both recurrent and convolutional architectures. We confirm that pretraine...
متن کاملDeViSE: A Deep Visual-Semantic Embedding Model
Modern visual recognition systems are often limited in their ability to scale to large numbers of object categories. This limitation is in part due to the increasing difficulty of acquiring sufficient training data in the form of labeled images as the number of object categories grows. One remedy is to leverage data from other sources – such as text data – both to train visual models and to con...
متن کاملSemantic Word Embedding (SWE)
......................................................................................................................... 2 Chapter 1 Semantic Word Embedding ........................................................................ 3 1.1 The Skip-gram moel ..................................................................................... 3 1.2 SWE as Constrained Optimization ....................
متن کاملDeep Semantic Embedding
We introduce Deep Semantic Embedding (DSE), a supervised learning algorithm which computes semantic representation for text documents by respecting their similarity to a given query. Unlike other methods that use singlelayer learning machines, DSE maps word inputs into a lowdimensional semantic space with deep neural network, and achieves a highly nonlinear embedding to model the human percepti...
متن کاملFinding beans in burgers: Deep semantic-visual embedding with localization
Several works have proposed to learn a two-path neural network that maps images and texts, respectively, to a same shared Euclidean space where geometry captures useful semantic relationships. Such a multi-modal embedding can be trained and used for various tasks, notably image captioning. In the present work, we introduce a new architecture of this type, with a visual path that leverages recen...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Mathematical Problems in Engineering
سال: 2021
ISSN: 1563-5147,1024-123X
DOI: 10.1155/2021/6654071